Search | BVS - Ministério da Saúde (Ministry of Health)

1.

Infusing behavior science into large language models for activity coaching.

Hegde, Narayan; Vardhan, Madhurima; Nathani, Deepak; Rosenzweig, Emily; Speed, Cathy; Karthikesalingam, Alan; Seneviratne, Martin.

PLOS Digit Health ; 3(4): e0000431, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38564502

ABSTRACT

Large language models (LLMs) have shown promise for task-oriented dialogue across a range of domains. The use of LLMs in health and fitness coaching is under-explored. Behavior science frameworks such as COM-B, which conceptualizes behavior change in terms of Capability (C), Opportunity (O) and Motivation (M), can be used to architect coaching interventions in a way that promotes sustained change. Here we aim to incorporate behavior science principles into an LLM using two knowledge infusion techniques: coach message priming (where exemplar coach responses are provided as context to the LLM), and dialogue re-ranking (where the COM-B category of the LLM output is matched to the inferred user need). Simulated conversations were conducted between the primed or unprimed LLM and a member of the research team, and then evaluated by 8 human raters. Ratings for the primed conversations were significantly higher in terms of empathy and actionability. The same raters also compared a single response generated by the unprimed, primed and re-ranked models, finding a significant uplift in actionability and empathy from the re-ranking technique. This is a proof of concept of how behavior science frameworks can be infused into automated conversational agents for a more principled coaching experience.
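
The dialogue re-ranking step described in the abstract above can be pictured with a small sketch. It is an illustration only, not the authors' implementation: the Candidate class, the score field and the rerank function are hypothetical names, and it assumes each candidate reply already carries a COM-B category from some classifier.

```python
from dataclasses import dataclass

# The three COM-B components named in the abstract.
COM_B = ("capability", "opportunity", "motivation")


@dataclass
class Candidate:
    text: str
    category: str  # COM-B category of this reply (assumed to come from a classifier)
    score: float   # hypothetical generation score from the LLM; higher is better


def rerank(candidates, user_need):
    """Order candidates so that those matching the inferred user need come first.

    Within the matching and non-matching groups, the higher score wins.
    """
    if user_need not in COM_B:
        raise ValueError(f"unknown COM-B category: {user_need}")
    return sorted(candidates, key=lambda c: (c.category != user_need, -c.score))


# Made-up candidate replies, for illustration only.
candidates = [
    Candidate("You can do it, keep going!", "motivation", 0.9),
    Candidate("Try a 10-minute walk after lunch.", "opportunity", 0.7),
    Candidate("Here is how to warm up safely.", "capability", 0.8),
]

best = rerank(candidates, user_need="opportunity")[0]
print(best.text)
```

The sort key puts category match ahead of the model score, so a lower-scored reply that fits the user's need is preferred over a higher-scored one that does not.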

2.

Publisher Correction: Large language models encode clinical knowledge.

Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; Mahdavi, S Sara; Wei, Jason; Chung, Hyung Won; Scales, Nathan; Tanwani, Ajay; Cole-Lewis, Heather; Pfohl, Stephen; Payne, Perry; Seneviratne, Martin; Gamble, Paul; Kelly, Chris; Babiker, Abubakr; Schärli, Nathanael; Chowdhery, Aakanksha; Mansfield, Philip; Demner-Fushman, Dina; Agüera Y Arcas, Blaise; Webster, Dale; Corrado, Greg S; Matias, Yossi; Chou, Katherine; Gottweis, Juraj; Tomasev, Nenad; Liu, Yun; Rajkomar, Alvin; Barral, Joelle; Semturs, Christopher; Karthikesalingam, Alan; Natarajan, Vivek.

Nature ; 620(7973): E19, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37500979

3.

Large language models encode clinical knowledge.

Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; Mahdavi, S Sara; Wei, Jason; Chung, Hyung Won; Scales, Nathan; Tanwani, Ajay; Cole-Lewis, Heather; Pfohl, Stephen; Payne, Perry; Seneviratne, Martin; Gamble, Paul; Kelly, Chris; Babiker, Abubakr; Schärli, Nathanael; Chowdhery, Aakanksha; Mansfield, Philip; Demner-Fushman, Dina; Agüera Y Arcas, Blaise; Webster, Dale; Corrado, Greg S; Matias, Yossi; Chou, Katherine; Gottweis, Juraj; Tomasev, Nenad; Liu, Yun; Rajkomar, Alvin; Barral, Joelle; Semturs, Christopher; Karthikesalingam, Alan; Natarajan, Vivek.

Nature ; 620(7972): 172-180, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37438534

ABSTRACT

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA and Measuring Massive Multitask Language Understanding (MMLU) clinical topics), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.

Subjects

Benchmarking, Computer Simulation, Knowledge, Medicine, Natural Language Processing, Bias, Clinical Competence, Comprehension, Datasets as Topic, Licensure, Medicine/methods, Medicine/standards, Patient Safety, Physicians

4.

The Association Between Central Line-Associated Bloodstream Infection and Central Line Access.

Ward, Andrew; Chemparathy, Augustine; Seneviratne, Martin; Gaskari, Shabnam; Mathew, Roshni; Wood, Matthew; Donnelly, Lane F; Lee, Grace M; Scheinker, David; Shin, Andrew Y.

Crit Care Med ; 51(6): 787-796, 2023 06 01.

Article in English | MEDLINE | ID: mdl-36920081

ABSTRACT

OBJECTIVES: Identifying modifiable risk factors associated with central line-associated bloodstream infections (CLABSIs) may lead to modifications to central line (CL) management. We hypothesize that the number of CL accesses per day is associated with an increased risk for CLABSI and that a significant fraction of CL access may be substituted with non-CL routes. DESIGN: We conducted a retrospective cohort study of patients with at least one CL device day from January 1, 2015, to December 31, 2019. A multivariate mixed-effects logistic regression model was used to estimate the association between the number of CL accesses in a given CL device day and prevalence of CLABSI within the following 3 days. SETTING: A 395-bed pediatric academic medical center. PATIENTS: Patients with at least one CL device day from January 1, 2015, to December 31, 2019. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: There were 138,411 eligible CL device days across 6,543 patients, with 639 device days within 3 days of a CLABSI (a total of 217 CLABSIs). The number of per-day CL accesses was independently associated with risk of CLABSI in the next 3 days (adjusted odds ratio, 1.007; 95% CI, 1.003-1.012; p = 0.002). Of medications administered through CLs, 88% were candidates for delivery through a peripheral line. On average, these accesses contributed a 6.3% increase in daily risk for CLABSI. CONCLUSIONS: The number of daily CL accesses is independently associated with risk of CLABSI in the next 3 days. In the pediatric population examined, most medications delivered through CLs could be safely administered peripherally. Efforts to reduce CL access may be an important strategy to include in contemporary CLABSI-prevention bundles.

Subjects

Bacteremia, Catheter-Related Infections, Central Venous Catheterization, Central Venous Catheters, Humans, Child, Catheter-Related Infections/etiology, Retrospective Studies, Central Venous Catheterization/adverse effects, Bacteremia/epidemiology, Bacteremia/etiology, Central Venous Catheters/adverse effects
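
The abstract for this entry reports an adjusted odds ratio of 1.007 per additional CL access. As a rough illustration of how a per-unit odds ratio compounds, the sketch below assumes the effect is log-linear in the number of accesses, which is the usual reading of a logistic-regression coefficient but is an assumption here. The function name is hypothetical and this is not the authors' analysis code.

```python
OR_PER_ACCESS = 1.007    # adjusted odds ratio per additional CL access (from the abstract)
CI_95 = (1.003, 1.012)   # 95% confidence interval (from the abstract)


def odds_ratio_for(k, or_per_unit=OR_PER_ACCESS):
    """Odds ratio for k additional accesses, assuming a log-linear effect."""
    return or_per_unit ** k


# Point estimate and interval bounds for a few illustrative access counts.
for k in (5, 10, 20):
    low, high = (odds_ratio_for(k, bound) for bound in CI_95)
    print(f"{k:>2} extra accesses: OR {odds_ratio_for(k):.3f} ({low:.3f} to {high:.3f})")
```

Raising the interval bounds to the same power gives only an approximate range for the compounded effect; it is shown to convey the width of the uncertainty, not as a formal interval.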

5.

Grains of Sand to Clinical Pearls: Realizing the Potential of Wearable Data.

Seneviratne, Martin G; Connolly, Susan B; Martin, Seth S; Parakh, Kapil.

Am J Med ; 136(2): 136-142, 2023 02.

Article in English | MEDLINE | ID: mdl-36351523

ABSTRACT

Despite the rapid growth of wearables as a consumer technology sector and a growing evidence base supporting their use, they have been slow to be adopted by the health system into clinical care. As regulatory, reimbursement, and technical barriers recede, a persistent challenge remains how to make wearable data actionable for clinicians: transforming disconnected grains of wearable data into meaningful clinical "pearls". In order to bridge this adoption gap, wearable data must become visible, interpretable, and actionable for the clinician. We showcase emerging trends and best practices that illustrate these 3 pillars, and offer some recommendations on how the ecosystem can move forward.

Subjects

Wearable Electronic Devices, Humans, Sand, Ecosystem

6.

User-centred design for machine learning in health care: a case study from care management.

Seneviratne, Martin G; Li, Ron C; Schreier, Meredith; Lopez-Martinez, Daniel; Patel, Birju S; Yakubovich, Alex; Kemp, Jonas B; Loreaux, Eric; Gamble, Paul; El-Khoury, Kristel; Vardoulakis, Laura; Wong, Doris; Desai, Janjri; Chen, Jonathan H; Morse, Keith E; Downing, N Lance; Finger, Lutz T; Chen, Ming-Jun; Shah, Nigam.

BMJ Health Care Inform ; 29(1)2022 Oct.

Article in English | MEDLINE | ID: mdl-36220304

ABSTRACT

OBJECTIVES: Few machine learning (ML) models are successfully deployed in clinical practice. One of the common pitfalls across the field is inappropriate problem formulation: designing ML to fit the data rather than to address a real-world clinical pain point. METHODS: We introduce a practical toolkit for user-centred design consisting of four questions covering: (1) solvable pain points, (2) the unique value of ML (eg, automation and augmentation), (3) the actionability pathway and (4) the model's reward function. This toolkit was implemented in a series of six participatory design workshops with care managers in an academic medical centre. RESULTS: Pain points amenable to ML solutions included outpatient risk stratification and risk factor identification. The endpoint definitions, triggering frequency and evaluation metrics of the proposed risk scoring model were directly influenced by care manager workflows and real-world constraints. CONCLUSIONS: Integrating user-centred design early in the ML life cycle is key for configuring models in a clinically actionable way. This toolkit can guide problem selection and influence choices about the technical setup of the ML problem.

Subjects

Machine Learning, User-Centered Design, Delivery of Health Care, Humans, Pain, Workflow

7.

Expanding the Secondary Use of Prostate Cancer Real World Data: Automated Classifiers for Clinical and Pathological Stage.

Bozkurt, Selen; Magnani, Christopher J; Seneviratne, Martin G; Brooks, James D; Hernandez-Boussard, Tina.

Front Digit Health ; 4: 793316, 2022.

Article in English | MEDLINE | ID: mdl-35721793

ABSTRACT

Background: Explicit documentation of stage is an endorsed quality metric by the National Quality Forum. Clinical and pathological cancer staging is inconsistently recorded within clinical narratives but can be derived from text in the Electronic Health Record (EHR). To address this need, we developed a Natural Language Processing (NLP) solution for extraction of clinical and pathological TNM stages from the clinical notes in prostate cancer patients. Methods: Data for patients diagnosed with prostate cancer between 2010 and 2018 were collected from a tertiary care academic healthcare system's EHR records in the United States. This system is linked to the California Cancer Registry, and contains data on diagnosis, histology, cancer stage, treatment and outcomes. A randomly selected sample of patients were manually annotated for stage to establish the ground truth for training and validating the NLP methods. For each patient, a vector representation of clinical text (written in English) was used to train a machine learning model alongside a rule-based model and compared with the ground truth. Results: A total of 5,461 prostate cancer patients were identified in the clinical data warehouse and over 30% were missing stage information. Thirty-three to thirty-six percent of patients were missing a clinical stage and the models accurately imputed the stage in 21-32% of cases. Twenty-one percent had a missing pathological stage and using NLP 71% of missing T stages and 56% of missing N stages were imputed. For both clinical and pathological T and N stages, the rule-based NLP approach out-performed the ML approach with a minimum F1 score of 0.71 and 0.40, respectively. For clinical M stage the ML approach out-performed the rule-based model with a minimum F1 score of 0.79 and 0.88, respectively. Conclusions: We developed an NLP pipeline to successfully extract clinical and pathological staging information from clinical narratives. Our results can serve as a proof of concept for using NLP to augment clinical and pathological stage reporting in cancer registries and EHRs to enhance the secondary use of these data.

8.

Development and Implementation of a Real-time Bundle-adherence Dashboard for Central Line-associated Bloodstream Infections.

Chemparathy, Augustine; Seneviratne, Martin G; Ward, Andrew; Mirchandani, Simran; Li, Ron; Mathew, Roshni; Wood, Matthew; Shin, Andrew Y; Donnelly, Lane F; Scheinker, David; Lee, Grace M.

Pediatr Qual Saf ; 6(4): e431, 2021.

Article in English | MEDLINE | ID: mdl-34235355

ABSTRACT

INTRODUCTION: Central line-associated bloodstream infections (CLABSIs) are the most common hospital-acquired infection in pediatric patients. High adherence to the CLABSI bundle mitigates CLABSIs. At our institution, there did not exist a hospital-wide system to measure bundle-adherence. We developed an electronic dashboard to monitor CLABSI bundle-adherence across the hospital and in real time. METHODS: Institutional stakeholders and areas of opportunity were identified through interviews and data analyses. We created a data pipeline to pull adherence data from twice-daily bundle checks and populate a dashboard in the electronic health record. The dashboard was developed to allow visualization of overall and individual element bundle-adherence across units. Monthly dashboard accesses and element-level bundle-adherence were recorded, and the nursing staff's feedback about the dashboard was obtained. RESULTS: Following deployment in September 2018, the dashboard was primarily accessed by quality improvement, clinical effectiveness and analytics, and infection prevention and control. Quality improvement and infection prevention and control specialists presented dashboard data at improvement meetings to inform unit-level accountability initiatives. All-element adherence across the hospital increased from 25% in September 2018 to 44% in December 2019, and average adherence to each bundle element increased between 2018 and 2019. CONCLUSIONS: CLABSI bundle-adherence, overall and by element, increased across the hospital following the deployment of a real-time electronic data dashboard. The dashboard enabled population-level surveillance of CLABSI bundle-adherence that informed bundle accountability initiatives. Data transparency enabled by electronic dashboards promises to be a useful tool for infectious disease control.

9.

Multitask prediction of organ dysfunction in the intensive care unit using sequential subnetwork routing.

Roy, Subhrajit; Mincu, Diana; Loreaux, Eric; Mottram, Anne; Protsyuk, Ivan; Harris, Natalie; Xue, Yuan; Schrouff, Jessica; Montgomery, Hugh; Connell, Alistair; Tomasev, Nenad; Karthikesalingam, Alan; Seneviratne, Martin.

J Am Med Inform Assoc ; 28(9): 1936-1946, 2021 08 13.

Article in English | MEDLINE | ID: mdl-34151965

ABSTRACT

OBJECTIVE: Multitask learning (MTL) using electronic health records allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer - impaired learning if tasks are not appropriately selected. We introduce a sequential subnetwork routing (SeqSNR) architecture that uses soft parameter sharing to find related tasks and encourage cross-learning between them. MATERIALS AND METHODS: Using the MIMIC-III (Medical Information Mart for Intensive Care-III) dataset, we train deep neural network models to predict the onset of 6 endpoints including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task (ST) models with naive multitask and SeqSNR in terms of discriminative performance and label efficiency. RESULTS: SeqSNR showed a modest yet statistically significant performance boost across 4 of 6 tasks compared with ST and naive multitasking. When the size of the training dataset was reduced for a given task (label efficiency), SeqSNR outperformed ST for all cases showing an average area under the precision-recall curve boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively. CONCLUSIONS: The SeqSNR architecture shows superior label efficiency compared with ST and naive multitasking, suggesting utility in scenarios in which endpoint labels are difficult to ascertain.

Subjects

Machine Learning, Multiple Organ Failure, Electronic Health Records, Humans, Intensive Care Units, Computer Neural Networks

10.

Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records.

Tomasev, Nenad; Harris, Natalie; Baur, Sebastien; Mottram, Anne; Glorot, Xavier; Rae, Jack W; Zielinski, Michal; Askham, Harry; Saraiva, Andre; Magliulo, Valerio; Meyer, Clemens; Ravuri, Suman; Protsyuk, Ivan; Connell, Alistair; Hughes, Cían O; Karthikesalingam, Alan; Cornebise, Julien; Montgomery, Hugh; Rees, Geraint; Laing, Chris; Baker, Clifton R; Osborne, Thomas F; Reeves, Ruth; Hassabis, Demis; King, Dominic; Suleyman, Mustafa; Back, Trevor; Nielson, Christopher; Seneviratne, Martin G; Ledsam, Joseph R; Mohamed, Shakir.

Nat Protoc ; 16(6): 2765-2787, 2021 06.

Article in English | MEDLINE | ID: mdl-33953393

ABSTRACT

Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.

Subjects

Deep Learning, Electronic Health Records, Research Design, Risk Assessment/methods, Humans, Software, Workflow
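
The abstract for this entry distinguishes continuous predictions (e.g., triggered every 6 h) from static ones (e.g., triggered at 24 h after admission). The sketch below shows one way to generate the two kinds of trigger times for a single admission. It is not taken from the protocol's open-source codebase: the function names are hypothetical, and starting the continuous schedule one interval after admission is an assumption.

```python
from datetime import datetime, timedelta


def continuous_triggers(admit, discharge, every_hours=6):
    """Prediction times at a fixed interval during the admission.

    Assumption: the first trigger falls one interval after admission.
    """
    step = timedelta(hours=every_hours)
    times = []
    t = admit + step
    while t <= discharge:
        times.append(t)
        t += step
    return times


def static_trigger(admit, discharge, after_hours=24):
    """Single prediction time at a fixed offset, or None if the stay is shorter."""
    t = admit + timedelta(hours=after_hours)
    return t if t <= discharge else None


# Made-up 36-hour admission, for illustration only.
admit = datetime(2021, 1, 1, 0, 0)
discharge = admit + timedelta(hours=36)
print(len(continuous_triggers(admit, discharge)), "continuous trigger times")
print("static trigger at", static_trigger(admit, discharge))
```

In a continuous setup each trigger time would produce its own model input built only from data recorded before that time; the static setup produces one input per admission.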

11.

Assessment of a Clinical Trial-Derived Survival Model in Patients With Metastatic Castration-Resistant Prostate Cancer.

Coquet, Jean; Bievre, Nicolas; Billaut, Vincent; Seneviratne, Martin; Magnani, Christopher J; Bozkurt, Selen; Brooks, James D; Hernandez-Boussard, Tina.

JAMA Netw Open ; 4(1): e2031730, 2021 01 04.

Article in English | MEDLINE | ID: mdl-33481032

ABSTRACT

Importance: Randomized clinical trials (RCTs) are considered the criterion standard for clinical evidence. Despite their many benefits, RCTs have limitations, such as costliness, that may reduce the generalizability of their findings among diverse populations and routine care settings. Objective: To assess the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic castration-resistant prostate cancer (CRPC) when the model is applied to real-world data from electronic health records (EHRs). Design, Setting, and Participants: The RCT-trained model and patient data from the RCTs were obtained from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge for prostate cancer, which occurred from March 16 to July 27, 2015. This challenge included 4 phase 3 clinical trials of patients with metastatic CRPC. Real-world data were obtained from the EHRs of a tertiary care academic medical center that includes a comprehensive cancer center. In this study, the DREAM challenge RCT-trained model was applied to real-world data from January 1, 2008, to December 31, 2019; the model was then retrained using EHR data with optimized feature selection. Patients with metastatic CRPC were divided into RCT and EHR cohorts based on data source. Data were analyzed from March 23, 2018, to October 22, 2020. Exposures: Patients who received treatment for metastatic CRPC. Main Outcomes and Measures: The primary outcome was the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic CRPC when the model is applied to real-world data. Model performance was compared using 10-fold cross-validation according to time-dependent integrated area under the curve (iAUC) statistics. Results: Among 2113 participants with metastatic CRPC, 1600 participants were included in the RCT cohort, and 513 participants were included in the EHR cohort. The RCT cohort comprised a larger proportion of White participants (1390 patients [86.9%] vs 337 patients [65.7%]) and a smaller proportion of Hispanic participants (14 patients [0.9%] vs 42 patients [8.2%]), Asian participants (41 patients [2.6%] vs 88 patients [17.2%]), and participants older than 75 years (388 patients [24.3%] vs 191 patients [37.2%]) compared with the EHR cohort. Participants in the RCT cohort also had fewer comorbidities (mean [SD], 1.6 [1.8] comorbidities vs 2.5 [2.6] comorbidities, respectively) compared with those in the EHR cohort. Of the 101 variables used in the RCT-derived model, 10 were not available in the EHR data set, 3 of which were among the top 10 features in the DREAM challenge RCT model. The best-performing EHR-trained model included only 25 of the 101 variables included in the RCT-trained model. The performance of the RCT-trained and EHR-trained models was adequate in the EHR cohort (mean [SD] iAUC, 0.722 [0.118] and 0.762 [0.106], respectively); model optimization was associated with improved performance of the best-performing EHR model (mean [SD] iAUC, 0.792 [0.097]). The EHR-trained model classified 256 patients as having a high risk of mortality and 256 patients as having a low risk of mortality (hazard ratio, 2.7; 95% CI, 2.0-3.7; log-rank P < .001). Conclusions and Relevance: In this study, although the RCT-trained models did not perform well when applied to real-world EHR data, retraining the models using real-world EHR data and optimizing variable selection was beneficial for model performance. As clinical evidence evolves to include more real-world data, both industry and academia will likely search for ways to balance model optimization with generalizability. This study provides a pragmatic approach to applying RCT-trained models to real-world data.

Subjects

Computer-Assisted Decision Making, Statistical Models, Castration-Resistant Prostatic Neoplasms/mortality, Adolescent, Adult, Aged, Electronic Health Records, Humans, Machine Learning, Male, Middle Aged, Prognosis, Castration-Resistant Prostatic Neoplasms/diagnosis, Castration-Resistant Prostatic Neoplasms/epidemiology, Randomized Controlled Trials as Topic, Survival Analysis, Young Adult

12.

A decade follow-up: On the prevalence, distribution and clinical correlates of myocardial fibrosis, as detected by cardiac magnetic resonance, in systemic lupus erythematosus.

Verma, Rohan; Balaraju, Varunan; Seneviratne, Martin; Garsia, Roger; Adelstein, Stephen; Puranik, Rajesh; Dennis, Mark.

Lupus ; 29(14): 1981-1983, 2020 12.

Article in English | MEDLINE | ID: mdl-33040648

Subjects

Cardiomyopathies, Systemic Lupus Erythematosus, Cardiomyopathies/diagnostic imaging, Cardiomyopathies/epidemiology, Fibrosis, Humans, Systemic Lupus Erythematosus/diagnosis, Systemic Lupus Erythematosus/epidemiology, Magnetic Resonance Imaging, Magnetic Resonance Spectroscopy, Prevalence

13.

Reporting of demographic data and representativeness in machine learning models using electronic health records.

Bozkurt, Selen; Cahan, Eli M; Seneviratne, Martin G; Sun, Ran; Lossio-Ventura, Juan A; Ioannidis, John P A; Hernandez-Boussard, Tina.

J Am Med Inform Assoc ; 27(12): 1878-1884, 2020 12 09.

Article in English | MEDLINE | ID: mdl-32935131

ABSTRACT

OBJECTIVE: The development of machine learning (ML) algorithms to address a variety of issues faced in clinical practice has increased rapidly. However, questions have arisen regarding biases in their development that can affect their applicability in specific populations. We sought to evaluate whether studies developing ML models from electronic health record (EHR) data report sufficient demographic data on the study populations to demonstrate representativeness and reproducibility. MATERIALS AND METHODS: We searched PubMed for articles applying ML models to improve clinical decision-making using EHR data. We limited our search to papers published between 2015 and 2019. RESULTS: Across the 164 studies reviewed, demographic variables were inconsistently reported and/or included as model inputs. Race/ethnicity was not reported in 64%; gender and age were not reported in 24% and 21% of studies, respectively. Socioeconomic status of the population was not reported in 92% of studies. Studies that mentioned these variables often did not report if they were included as model inputs. Few models (12%) were validated using external populations. Few studies (17%) open-sourced their code. Populations in the ML studies include higher proportions of White and Black yet fewer Hispanic subjects compared to the general US population. DISCUSSION: The demographic characteristics of study populations are poorly reported in the ML literature based on EHR data. Demographic representativeness in training data and model transparency is necessary to ensure that ML models are deployed in an equitable and reproducible manner. Wider adoption of reporting guidelines is warranted to improve representativeness and reproducibility.

Subjects

Demography, Electronic Health Records, Machine Learning, Ethnicity, Female, Humans, Male, Nutrition Surveys, Socioeconomic Factors

14.

Development and validation of phenotype classifiers across multiple sites in the observational health data sciences and informatics network.

Kashyap, Mehr; Seneviratne, Martin; Banda, Juan M; Falconer, Thomas; Ryu, Borim; Yoo, Sooyoung; Hripcsak, George; Shah, Nigam H.

J Am Med Inform Assoc ; 27(6): 877-883, 2020 06 01.

Article in English | MEDLINE | ID: mdl-32374408

ABSTRACT

OBJECTIVE: Accurate electronic phenotyping is essential to support collaborative observational research. Supervised machine learning methods can be used to train phenotype classifiers in a high-throughput manner using imperfectly labeled data. We developed 10 phenotype classifiers using this approach and evaluated performance across multiple sites within the Observational Health Data Sciences and Informatics (OHDSI) network. MATERIALS AND METHODS: We constructed classifiers using the Automated PHenotype Routine for Observational Definition, Identification, Training and Evaluation (APHRODITE) R-package, an open-source framework for learning phenotype classifiers using datasets in the Observational Medical Outcomes Partnership Common Data Model. We labeled training data based on the presence of multiple mentions of disease-specific codes. Performance was evaluated on cohorts derived using rule-based definitions and real-world disease prevalence. Classifiers were developed and evaluated across 3 medical centers, including 1 international site. RESULTS: Compared to the multiple mentions labeling heuristic, classifiers showed a mean recall boost of 0.43 with a mean precision loss of 0.17. Performance decreased slightly when classifiers were shared across medical centers, with mean recall and precision decreasing by 0.08 and 0.01, respectively, at a site within the USA, and by 0.18 and 0.10, respectively, at an international site. DISCUSSION AND CONCLUSION: We demonstrate a high-throughput pipeline for constructing and sharing phenotype classifiers across sites within the OHDSI network using APHRODITE. Classifiers exhibit good portability between sites within the USA, however limited portability internationally, indicating that classifier generalizability may have geographic limitations, and, consequently, sharing the classifier-building recipe, rather than the pretrained classifiers, may be more useful for facilitating collaborative observational research.

Subjects

Electronic Health Records/classification, Medical Informatics, Supervised Machine Learning, Classification/methods, Data Science, Humans, Observational Studies as Topic
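
The abstract for this entry describes labeling training data "based on the presence of multiple mentions of disease-specific codes". A minimal sketch of such a labeling heuristic follows. It is written in Python for illustration and does not reproduce the APHRODITE R package: the function name, the threshold of two mentions and the placeholder codes are all assumptions.

```python
def noisy_labels(patient_codes, disease_codes, min_mentions=2):
    """Label a patient positive when disease-specific codes occur at least min_mentions times.

    patient_codes maps a patient id to the list of codes in that patient's record.
    The resulting labels are imperfect by design and are meant to be used as training
    targets for a classifier, not as ground truth.
    """
    labels = {}
    for patient_id, codes in patient_codes.items():
        mentions = sum(1 for code in codes if code in disease_codes)
        labels[patient_id] = mentions >= min_mentions
    return labels


# Placeholder codes, not real vocabulary entries.
disease_codes = {"DX_A", "DX_B"}
patient_codes = {
    "p1": ["DX_A", "LAB_1", "DX_A"],  # two mentions: labeled positive
    "p2": ["DX_B", "LAB_2"],          # one mention: labeled negative
    "p3": ["LAB_1"],                  # no mentions: labeled negative
}
print(noisy_labels(patient_codes, disease_codes))
```

A strict heuristic like this tends to miss true cases with few coded mentions, which is consistent with the abstract's finding that trained classifiers gained recall relative to the heuristic at some cost in precision.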

15.

Predicting the Incidence of Pressure Ulcers in the Intensive Care Unit Using Machine Learning.

Cramer, Eric M; Seneviratne, Martin G; Sharifi, Husham; Ozturk, Alp; Hernandez-Boussard, Tina.

EGEMS (Wash DC) ; 7(1): 49, 2019 Sep 05.

Article in English | MEDLINE | ID: mdl-31534981

ABSTRACT

BACKGROUND: Reducing hospital-acquired pressure ulcers (PUs) in intensive care units (ICUs) has emerged as an important quality metric for health systems internationally. Limited work has been done to characterize the profile of PUs in the ICU using observational data from the electronic health record (EHR). Consequently, there are limited EHR-based prognostic tools for determining a patient's risk of PU development, with most institutions relying on nurse-calculated risk scores such as the Braden score to identify high-risk patients. METHODS AND RESULTS: Using EHR data from 50,851 admissions in a tertiary ICU (MIMIC-III), we show that the prevalence of PUs at stage 2 or above is 7.8 percent. For the 1,690 admissions where a PU was recorded on day 2 or beyond, we evaluated the prognostic value of the Braden score measured within the first 24 hours. A high-risk Braden score (<=12) had precision 0.09 and recall 0.50 for the future development of a PU. We trained a range of machine learning algorithms using demographic parameters, diagnosis codes, laboratory values and vitals available from the EHR within the first 24 hours. A weighted linear regression model showed precision 0.09 and recall 0.71 for future PU development. Classifier performance was not improved by integrating Braden score elements into the model. CONCLUSION: We demonstrate that an EHR-based model can outperform the Braden score as a screening tool for PUs. This may be a useful tool for automatic risk stratification early in an admission, helping to guide quality protocols in the ICU, including the allocation and timing of prophylactic interventions.
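
The abstract above compares screening rules by precision and recall, with a Braden score of 12 or below treated as high risk. The sketch below shows how those two metrics are computed for such a threshold rule. The scores and outcomes are made-up toy values, not MIMIC-III data, and the function name is hypothetical.

```python
BRADEN_HIGH_RISK = 12  # high-risk threshold from the abstract (score <= 12)


def precision_recall(flagged, actual):
    """Precision and recall of a binary screening rule against observed outcomes."""
    tp = sum(1 for f, a in zip(flagged, actual) if f and a)
    fp = sum(1 for f, a in zip(flagged, actual) if f and not a)
    fn = sum(1 for f, a in zip(flagged, actual) if not f and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data for illustration only.
braden_scores = [9, 12, 15, 18, 11, 20]
developed_pu = [True, False, True, False, False, False]

flagged = [score <= BRADEN_HIGH_RISK for score in braden_scores]
precision, recall = precision_recall(flagged, developed_pu)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

When the outcome is rare, precision stays low even for a useful screen, which is why the abstract's comparison between the Braden score and the EHR-based model turns on recall at equal precision.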

16.

Machine Learning Approaches for Extracting Stage from Pathology Reports in Prostate Cancer.

Lenain, Raphael; Seneviratne, Martin G; Bozkurt, Selen; Blayney, Douglas W; Brooks, James D; Hernandez-Boussard, Tina.

Stud Health Technol Inform ; 264: 1522-1523, 2019 Aug 21.

Article in English | MEDLINE | ID: mdl-31438212

ABSTRACT

Clinical and pathological stage are defining parameters in oncology, which direct a patient's treatment options and prognosis. Pathology reports contain a wealth of staging information that is not stored in structured form in most electronic health records (EHRs). Therefore, we evaluated three supervised machine learning methods (Support Vector Machine, Decision Trees, Gradient Boosting) to classify free-text pathology reports for prostate cancer into T, N and M stage groups.

Subjects

Machine Learning; Prostatic Neoplasms; Electronic Health Records; Humans; Male

17.

Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment.

Banerjee, Imon; Li, Kevin; Seneviratne, Martin; Ferrari, Michelle; Seto, Tina; Brooks, James D; Rubin, Daniel L; Hernandez-Boussard, Tina.

JAMIA Open ; 2(1): 150-159, 2019 Apr.

Article in English | MEDLINE | ID: mdl-31032481

ABSTRACT

Background: The population-based assessment of patient-centered outcomes (PCOs) has been limited by the difficulty of collecting these data efficiently and accurately. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence of these outcomes. We present and demonstrate the accuracy of an NLP pipeline that assesses the presence, absence, or risk discussion of two important PCOs following prostate cancer treatment: urinary incontinence (UI) and bowel dysfunction (BD). Methods: We propose a weakly supervised NLP approach which annotates electronic medical record clinical notes without requiring manual chart review. A weighted function of neural word embeddings was used to create a sentence-level vector representation of relevant expressions extracted from the clinical notes. Sentence vectors were used as input for a multinomial logistic model, with the output being presence, absence or risk discussion of UI/BD. The classifier was trained on automated sentence annotations derived only from domain-specific dictionaries (weak supervision). Results: The model achieved an average F1 score of 0.86 for the sentence-level, three-tier classification task (presence/absence/risk) in both UI and BD. The model also outperformed a pre-existing rule-based model for note-level annotation of UI by a significant margin. Conclusions: We demonstrate a machine learning method to categorize clinical notes based on important PCOs that trains a classifier on sentence vector representations labeled with a domain-specific dictionary, which eliminates the need for manual engineering of linguistic rules or manual chart review for extracting the PCOs. The weakly supervised NLP pipeline showed promising sensitivity and specificity for identifying important PCOs in unstructured clinical text notes compared to rule-based algorithms.
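The two ingredients named in the Methods above, dictionary-based weak labels and a weighted combination of word embeddings per sentence, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the term lists, the labeling priority (risk before absence), the weights and the two-dimensional embeddings are all hypothetical, and the multinomial logistic classifier that would consume these vectors is omitted.

```python
# Hypothetical domain dictionaries; the study's actual term lists are not given here.
TARGET_TERMS = {"incontinence", "leakage"}
NEGATION_TERMS = {"no", "denies", "without"}
RISK_TERMS = {"risk", "risks", "may", "possible"}


def tokenize(sentence):
    """Lowercase and split on whitespace after dropping commas and periods."""
    return sentence.lower().replace(",", " ").replace(".", " ").split()


def weak_label(sentence):
    """Assign presence/absence/risk from dictionaries alone (weak supervision).

    Returns None when the sentence mentions no target term.
    """
    tokens = set(tokenize(sentence))
    if not tokens & TARGET_TERMS:
        return None
    if tokens & RISK_TERMS:
        return "risk"
    if tokens & NEGATION_TERMS:
        return "absence"
    return "presence"


def sentence_vector(tokens, embeddings, weights, dim):
    """Weighted average of the word vectors for tokens found in `embeddings`."""
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok in embeddings:
            w = weights.get(tok, 1.0)  # unlisted tokens get weight 1.0
            for i, x in enumerate(embeddings[tok]):
                total[i] += w * x
            weight_sum += w
    if weight_sum == 0.0:
        return total  # no known tokens: zero vector
    return [x / weight_sum for x in total]


# Toy usage with made-up 2-d embeddings and weights.
toy_embeddings = {"denies": [1.0, 0.0], "incontinence": [0.0, 1.0]}
toy_weights = {"denies": 3.0, "incontinence": 1.0}
sentence = "Patient denies urinary incontinence."
print(weak_label(sentence))
print(sentence_vector(tokenize(sentence), toy_embeddings, toy_weights, dim=2))
```

The point of the design is that labels come from dictionaries rather than manual chart review, so the training set can be built automatically; the price is label noise, which the downstream classifier has to tolerate.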

18.

Merging heterogeneous clinical data to enable knowledge discovery.

Seneviratne, Martin G; Kahn, Michael G; Hernandez-Boussard, Tina.

Pac Symp Biocomput ; 24: 439-443, 2019.

Article in English | MEDLINE | ID: mdl-30864344

ABSTRACT

The vision of precision medicine relies on the integration of large-scale clinical, molecular and environmental datasets. Data integration may be thought of along two axes: data fusion across institutions, and data fusion across modalities. Cross-institutional data sharing that maintains semantic integrity hinges on the adoption of data standards and a push toward ontology-driven integration. The goal should be the creation of query-able data repositories spanning primary and tertiary care providers, disease registries, research organizations etc. to produce rich longitudinal datasets. Cross-modality sharing involves the integration of multiple data streams, from structured EHR data (diagnosis codes, laboratory tests) to genomics, imaging, monitors and patient-generated data including wearable devices. This integration presents unique technical, semantic, and ethical challenges; however recent work suggests that multi-modal clinical data can significantly improve the performance of phenotyping and prediction algorithms, powering knowledge discovery at the patient- and population-level.

Subjects

Big Data; Information Dissemination/methods; Knowledge Discovery/methods; Computational Biology; Humans; Precision Medicine/methods; Precision Medicine/statistics & numerical data; United States

19.

Distribution of global health measures from routinely collected PROMIS surveys in patients with breast cancer or prostate cancer.

Seneviratne, Martin G; Bozkurt, Selen; Patel, Manali I; Seto, Tina; Brooks, James D; Blayney, Douglas W; Kurian, Allison W; Hernandez-Boussard, Tina.

Cancer ; 125(6): 943-951, 2019 Mar 15.

Article in English | MEDLINE | ID: mdl-30512191

ABSTRACT

BACKGROUND: The collection of patient-reported outcomes (PROs) is an emerging priority internationally, guiding clinical care, quality improvement projects and research studies. After the deployment of Patient-Reported Outcomes Measurement Information System (PROMIS) surveys in routine outpatient workflows at an academic cancer center, electronic health record data were used to evaluate survey completion rates and self-reported global health measures across 2 tumor types: breast and prostate cancer. METHODS: This study retrospectively analyzed 11,657 PROMIS surveys from patients with breast cancer and 4,411 surveys from patients with prostate cancer, and it calculated survey completion rates and global physical health (GPH) and global mental health (GMH) scores between 2013 and 2018. RESULTS: A total of 36.6% of eligible patients with breast cancer and 23.7% of patients with prostate cancer completed at least 1 survey, with completion rates lower among black patients for both tumor types (P < .05). The mean T scores (calibrated to a general population mean of 50) for GPH were 48.4 ± 9 for breast cancer and 50.6 ± 9 for prostate cancer, and the GMH scores were 52.7 ± 8 and 52.1 ± 9, respectively. GPH and GMH were frequently lower among ethnic minorities, patients without private health insurance, and those with advanced disease. CONCLUSIONS: This analysis provides important baseline data on patient-reported global health in breast and prostate cancer. Demonstrating that PROs can be integrated into clinical workflows, this study shows that supportive efforts may be needed to improve PRO collection and global health endpoints in vulnerable populations.

Subjects

Breast Neoplasms/epidemiology; Prostatic Neoplasms/epidemiology; Academic Medical Centers; Adult; Aged; Aged, 80 and over; Breast Neoplasms/ethnology; Electronic Health Records/statistics & numerical data; Female; Health Surveys/statistics & numerical data; Humans; Male; Mental Health; Middle Aged; Patient Reported Outcome Measures; Prostatic Neoplasms/ethnology; Retrospective Studies; Self Report
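The mean T scores reported in the PROMIS abstract for this entry can be read as distances from the reference population. The sketch below converts a T score to standard-deviation units; the reference mean of 50 comes from the abstract, while the reference standard deviation of 10 is an assumption taken from the usual T-score convention and is not stated in the abstract. The function name is hypothetical.

```python
def t_score_to_z(t_score, ref_mean=50.0, ref_sd=10.0):
    """Convert a T score to standard-deviation units from the reference mean.

    ref_sd=10.0 is an assumption (standard T-score convention), not a value
    given in the abstract.
    """
    return (t_score - ref_mean) / ref_sd


# Mean scores quoted in the abstract.
reported_means = {
    "GPH, breast cancer": 48.4,
    "GPH, prostate cancer": 50.6,
    "GMH, breast cancer": 52.7,
    "GMH, prostate cancer": 52.1,
}

for label, t in reported_means.items():
    print(f"{label}: T={t} -> {t_score_to_z(t):+.2f} SD from the reference mean")
```

Under that assumption, all four cohort means sit within about a quarter of a standard deviation of the general-population mean.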

20.

Architecture and Implementation of a Clinical Research Data Warehouse for Prostate Cancer.

Seneviratne, Martin G; Seto, Tina; Blayney, Douglas W; Brooks, James D; Hernandez-Boussard, Tina.

EGEMS (Wash DC) ; 6(1): 13, 2018 Jun 01.

Article in English | MEDLINE | ID: mdl-30094285

ABSTRACT

BACKGROUND: Electronic health record (EHR) based research in oncology can be limited by missing data and a lack of structured data elements. Clinical research data warehouses for specific cancer types can enable the creation of more robust research cohorts. METHODS: We linked data from the Stanford University EHR with the Stanford Cancer Institute Research Database (SCIRDB) and the California Cancer Registry (CCR) to create a research data warehouse for prostate cancer. The database was supplemented with information from clinical trials, natural language processing of clinical notes and surveys on patient-reported outcomes. RESULTS: 11,898 unique prostate cancer patients were identified in the Stanford EHR, of whom 3,936 were matched to the Stanford cancer registry and 6,153 to the CCR. 7,158 patients with EHR data and at least one of SCIRDB and CCR data were initially included in the warehouse. CONCLUSIONS: A disease-specific clinical research data warehouse combining multiple data sources can facilitate secondary data use and enhance observational research in oncology.
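The inclusion rule in the Results above (EHR data plus at least one of SCIRDB or CCR) is a set intersection with a union. The sketch below shows that logic on hypothetical patient identifiers; it assumes the three sources have already been linked to a shared ID, which is the hard part of real record linkage and is not shown.

```python
def warehouse_cohort(ehr_ids, scirdb_ids, ccr_ids):
    """Patients with EHR data and a match in at least one registry."""
    return set(ehr_ids) & (set(scirdb_ids) | set(ccr_ids))


# Hypothetical linked patient identifiers, for illustration only.
ehr = {"p1", "p2", "p3", "p4", "p5"}
scirdb = {"p1", "p2", "p9"}   # p9 has no EHR record, so it is excluded
ccr = {"p2", "p3"}

cohort = warehouse_cohort(ehr, scirdb, ccr)
print(sorted(cohort))
```

Because a patient can appear in both registries, the cohort size is at most the sum of the two match counts and is usually smaller; this is consistent with the abstract, where the 7,158 included patients are fewer than the 3,936 and 6,153 registry matches added together.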
